【Quant】support ue8m0 for fp8_quant_blockwise #77153
Your PR has been submitted successfully. Thank you for contributing to this open-source project!
Codecov Report (coverage diff, develop #77153)
Coverage: 0.00%
Files: 1
Lines: 19
Branches: 0
Hits: 0
Misses: 19
Partials: 0
8196010
  - op: fp8_quant_blockwise
-   args: (Tensor x, float epsilon, bool using_1x128_vec_quant, bool input_transpose, bool output_scale_transpose, bool return_transpose_only, bool using_e5m2, bool using_pow2_scale)
+   args: (Tensor x, float epsilon, bool using_1x128_vec_quant, bool input_transpose, bool output_scale_transpose, bool return_transpose_only, bool using_e5m2, bool using_pow2_scale, bool using_ue8m0_scale)
Do using_pow2_scale and using_ue8m0_scale affect each other?
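No, they don't. using_pow2_scale means the scale is a power of 2, but its dtype is still float32. using_ue8m0_scale means the scale is not only a power of 2 but is also output as int32 (four ue8m0 values packed into each element). When both are enabled, using_ue8m0_scale takes precedence.
The unit tests also include Cartesian-product test cases covering both flags, so the two do not conflict.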
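For readers unfamiliar with the format, here is a minimal sketch of what "int32 (four ue8m0 values)" could look like. It is not taken from this PR's kernel. The function names are hypothetical, the bias of 127 follows the usual E8M0 definition (value = 2^(code - 127)), and the little-endian byte order inside the int32 is an assumption that the PR does not confirm.

```python
import struct


def float32_pow2_to_ue8m0(scale):
    """Return the 8-bit biased exponent of a positive power-of-two float32.

    For such a value the float32 mantissa bits are all zero, so the
    exponent field alone (bias 127) describes the number exactly.
    """
    bits = struct.unpack("<I", struct.pack("<f", scale))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    # Reject negatives, non-powers-of-two, zero/subnormals, and inf/NaN.
    if sign or mantissa or exponent == 0 or exponent == 0xFF:
        raise ValueError("scale must be a positive normal power of two")
    return exponent


def ue8m0_to_float(code):
    """Decode a UE8M0 byte back to its power-of-two value."""
    return 2.0 ** (code - 127)


def pack_four_ue8m0(codes):
    """Pack four UE8M0 bytes into one signed int32.

    ASSUMPTION: little-endian order (codes[0] in the lowest byte).
    """
    if len(codes) != 4:
        raise ValueError("expected exactly four codes")
    return struct.unpack("<i", bytes(codes))[0]


def unpack_four_ue8m0(word):
    """Inverse of pack_four_ue8m0 under the same byte-order assumption."""
    return list(struct.pack("<i", word))


# Four power-of-two float32 scales become a single int32 element.
scales = [1.0, 0.5, 2.0, 2.0 ** -7]
codes = [float32_pow2_to_ue8m0(s) for s in scales]
word = pack_four_ue8m0(codes)
restored = [ue8m0_to_float(c) for c in unpack_four_ue8m0(word)]
```

Under these assumptions the float32 power-of-two scale and the UE8M0 byte carry the same information, which is consistent with the two flags not conflicting: the second only changes how the scale is stored.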
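For an official API, these details need to be explained in the API documentation; otherwise users can only work out what the parameters do by trial runs.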
OK. Currently none of the attrs have any doc comments, so in my next PR I'll add complete documentation for this operator in the Python API.
PR Category
Operator Mechanism
PR Types
New features
Description
Upgrade fp8_quant_blockwise to support ue8m0-type scales.